Crowdsourced audit of Twitter’s recommender systems

This research conducts an audit of Twitter’s recommender system, aiming to examine the disparities between users’ curated timelines and their subscription choices. Through the combined use of a browser extension and data collection via the Twitter API, our investigation reveals a high amplification of friends from the same community, a preference for amplifying emotionally charged and toxic tweets, and an uneven algorithmic amplification across friends’ political leanings. This audit emphasizes the importance of transparency and of increased awareness regarding the impact of algorithmic curation.


Participants Statistics
A. Demographic. The proportions of participants in different age groups are as follows: 16.5% are aged ≤ 25, 27.0% fall within the 25-34 age range, 37.8% are aged 35-49, and the remaining 18.8% are ≥ 50 years old. 86.7% of the participants declared themselves as "man", 13.3% as "woman" and less than 0.1% as other.
B. Twitter Usage. During the data collection timeframe, the participants connected to Twitter on their desktop on average 5 times a day ([1, 15], 5th to 95th percentiles), for a median session length of 3 minutes ([0.8, 35], 5th to 95th percentiles). During each session, the participants read on average 30 tweets; see the distribution in Figure S1.
C. Political Representativeness. To assess the representativeness of our participant cohort with respect to the political leanings of the accounts they follow, we proceeded as follows. We randomly selected French Twitter accounts from the follower network and determined their political leaning (far-left, left, or center) based on the political leanings of their friends, as for our participants. Subsequently, we computed the political leaning distribution of the friends of these accounts. Following this, we calculated the Wasserstein distance between the overall political leaning distribution and the distribution generated by the political leanings of friends of a random subset of users (matching the cardinality of participants) of a given label, either far-left, left, or center. The Wasserstein distance was computed taking into account the periodicity of the opinion space at ±1. Finally, we determined the Wasserstein distance between the overall political opinion distribution and the distribution of political leaning of participants' friends, restricted to participants of a specific leaning. Overall, we cannot reject the null hypothesis that the distribution of political leaning among participants' friends is the same as the one derived from a random sampling of French Twitter users with corresponding political leanings.
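As an illustration of a Wasserstein distance that respects the periodicity of the opinion space at ±1, the sketch below bins two samples on [-1, 1) and uses the known result that, on a circle, the 1-Wasserstein distance equals the integral of |F − G − α|, where F and G are the cumulative distributions and α is the median of F − G. This is only one possible way to handle the periodicity; the function name, the number of bins and the binning strategy are assumptions and are not taken from the study.

```python
from statistics import median


def circular_wasserstein(xs, ys, n_bins=200, lo=-1.0, hi=1.0):
    """Approximate 1-Wasserstein distance between two samples on a periodic
    interval [lo, hi), using histograms with n_bins bins (illustrative sketch)."""
    period = hi - lo
    width = period / n_bins

    def histogram(values):
        # Normalised histogram; values are wrapped onto [lo, hi).
        counts = [0.0] * n_bins
        for v in values:
            idx = int(((v - lo) % period) / width)
            counts[min(idx, n_bins - 1)] += 1.0 / len(values)
        return counts

    p, q = histogram(xs), histogram(ys)

    # Running difference of the two cumulative distributions.
    diffs, running = [], 0.0
    for p_i, q_i in zip(p, q):
        running += p_i - q_i
        diffs.append(running)

    # On a circle, the optimal offset is the median of the CDF difference.
    alpha = median(diffs)
    return width * sum(abs(d - alpha) for d in diffs)
```

With this definition, two opinions located at -0.955 and +0.955 are close to each other (they are separated by 0.09 through the ±1 boundary), whereas a non-periodic distance would place them 1.91 apart.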

Effect of incomplete data collection
To rule out the potential influence of incomplete data collection on the observed disparity between content displayed in participants' timelines and their friends' posts, we conducted tests on the distribution of key variables: followers count, tweet count, and political leanings. We compared these distributions between friends whose data was and wasn't collected.
Regarding followers count and tweet count, we employed chi-square tests to assess frequency differences in respective bins between the two groups. Our findings indicate that, at a significance level of 0.05, 94.5% (95.5%) of participants exhibited no statistically significant disparities in their followers count (tweet count) between fetched and non-fetched friends.
For the continuous variable of political leaning, we performed a two-sample Kolmogorov-Smirnov test to compare the political opinion distributions of fetched and non-fetched friends. In 79.4% of cases, at a significance level of 0.05, no statistically significant differences were observed between the political leanings of fetched and non-fetched friends.
In each case, we recalculated the algorithmic amplification specifically for participants where no statistical differences were found between fetched and non-fetched friends. The results presented in the main text remain consistent and unaffected by this analysis.
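The two-sample Kolmogorov-Smirnov comparison described above can be sketched for a single participant as follows. This is a minimal, dependency-free illustration rather than the code used in the study: the function name and the input lists are hypothetical, and the p-value relies on the standard asymptotic approximation of the Kolmogorov distribution, whereas statistical libraries such as SciPy (`scipy.stats.ks_2samp`) may use exact methods for small samples.

```python
import math


def ks_two_sample(a, b):
    """Return (D, p) for a two-sample Kolmogorov-Smirnov test.
    D is the maximum gap between the two empirical CDFs; p is an
    asymptotic approximation (illustrative sketch)."""
    a, b = sorted(a), sorted(b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        # Advance both empirical CDFs past every value <= x.
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))

    # Asymptotic p-value with a small-sample correction on the effective size.
    n_eff = n * m / (n + m)
    lam = (math.sqrt(n_eff) + 0.12 + 0.11 / math.sqrt(n_eff)) * d
    if lam < 0.2:
        return d, 1.0  # the series below equals 1 to within ~1e-12 here
    p = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam * lam)
                  for k in range(1, 101))
    return d, min(1.0, max(0.0, p))


# Hypothetical political leanings of one participant's friends.
fetched = [-0.6, -0.4, -0.35, -0.1, 0.0, 0.2]
non_fetched = [-0.55, -0.45, -0.3, -0.05, 0.1, 0.25]
statistic, p_value = ks_two_sample(fetched, non_fetched)
no_significant_difference = p_value >= 0.05
```

A participant would then be kept in the robustness check when `no_significant_difference` is true, mirroring the 0.05 significance level used in the text.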

Political Determination
The estimation of political orientations was conducted using the Politoscope database as follows. First, we initialized the opinion values of the far-left (Jean-Luc Mélenchon) and far-right (Marine Le Pen) leaders to ±0.75 (an arbitrarily chosen value). Subsequently, the opinion of the centrist leader (Emmanuel Macron) was determined by taking the average of the opinions of the two anchored leaders, weighted by the angular similarities between the nodes' embeddings, obtained using node2vec (1). Interestingly, this calculation resulted in an opinion value close to zero (-0.02). The angular similarity is the complement of the angular distance, which, unlike the cosine similarity, is a formal distance metric: angular similarity = 1 − arccos(cosine similarity)/π. Then, for each Twitter account, we computed the angular similarity between the account's embedding and the embeddings of the three leaders. The political leaning of the account was then determined by averaging the opinions of the two closest leaders, using their angular similarities as weights. In cases where the two closest leaders were the extreme ones, we accounted for the periodicity of the opinion space, ensuring that the assigned opinion spanned the entire range from -1 to +1. Furthermore, we only assigned a political leaning if the angular similarity with the closest leader was at least 10% higher than the similarity with the farther leader. Accounts without a clearly defined political leaning constituted less than 10% of the accounts in our database.
To validate the stability of the resulting opinion scale, we experimented with different anchors, i.e. different leaders and different anchor values, which led to similar assignments. Additionally, we observed a correspondence between the opinion scale and a cluster analysis of the retweet graph (2). Figure S2 presents the spatialized retweet graph, where nodes are color-coded based on their assigned numerical opinion, providing a visual representation of the political landscape. Moreover, the political groups declared by French members of Parliament aligned with the opinion scale, as illustrated in Figure S3. The Pearson correlation coefficient between the political leaning assigned to MPs (averaged by political group) and the overall left-right ideological stance of the parties, assessed by political experts within the 2019 Chapel Hill Expert Survey (3), equals r = 0.986, p < 10^-5, as displayed in Figure S4.
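The assignment procedure described in this section can be sketched as below. Several choices are assumptions made for illustration only: the sign convention (far-left at -0.75, far-right at +0.75), the reading of "the farther leader" as the least similar of the three leaders, the way the average is unwrapped across the ±1 boundary, and every function and variable name. Embeddings are plain lists of floats standing in for node2vec vectors.

```python
import math


def angular_similarity(u, v):
    """1 - arccos(cosine similarity) / pi, as defined in the text."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    cosine = max(-1.0, min(1.0, dot / (norm_u * norm_v)))  # guard rounding
    return 1.0 - math.acos(cosine) / math.pi


def wrap(opinion):
    """Map a value onto the periodic opinion space [-1, 1)."""
    return (opinion + 1.0) % 2.0 - 1.0


def political_leaning(account, leaders, min_ratio=1.10):
    """leaders maps a name to (embedding, opinion). Returns an opinion in
    [-1, 1), or None when no leaning is clearly defined (assumed rule)."""
    ranked = sorted(
        ((angular_similarity(account, emb), op) for emb, op in leaders.values()),
        reverse=True,
    )
    (s1, o1), (s2, o2) = ranked[0], ranked[1]
    s_far = ranked[-1][0]
    # Assumed reading of the 10% rule: closest vs. least similar leader.
    if s1 < min_ratio * s_far:
        return None
    # If the two closest leaders are the extremes, average across +/-1.
    if abs(o1 - o2) > 1.0:
        o2 += 2.0 if o1 > o2 else -2.0
    return wrap((s1 * o1 + s2 * o2) / (s1 + s2))


# Toy embeddings; the opinion values follow the anchors quoted in the text.
leaders = {
    "far_left": ([1.0, 0.0, 0.0], -0.75),
    "center": ([0.0, 1.0, 0.0], -0.02),
    "far_right": ([0.0, 0.0, 1.0], 0.75),
}
left_leaning = political_leaning([1.0, 0.5, 0.0], leaders)
between_extremes = political_leaning([1.0, 0.0, 1.0], leaders)
undefined = political_leaning([1.0, 1.0, 1.0], leaders)
```

In this toy setting, an account equally close to both extreme leaders lands at the ±1 boundary rather than at 0, which is the purpose of accounting for the periodicity of the opinion space.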

Fig. S6. (A) Gini coefficient of participants' friends' published messages vs. Gini coefficient of participants' friends' impressed messages. Only participants having seen more than 500 tweets (captured by the browser extension) during the considered period are displayed. (B) Lorenz curves associated with the publication and impression Gini coefficients, averaged over participants.
Fig. S9. Distribution of the political leaning of accounts: Followed by Participants (Black), Appearing in Participant Timelines as In-Network (Orange), Appearing in Participant Timelines as Out-of-Network (Blue), Appearing in Participant Timelines as In- and Out-of-Network combined (Green). We segment participants based on their political leaning. Filled areas correspond to the standard error determined via bootstrap over participants.

Fig. S10. Follow network, visualized after dimensionality reduction through UMAP over node2vec embeddings. For clarity's sake, only participants' friends and the 220k snowball seeds are displayed. Clusters identified by HDBSCAN are colored; nodes in orange correspond to outliers, which belong to no identified cluster.